Our dataset is very large, so we chose to focus on a few numbers that stood out to us. First, we found the total number of observations: 10199. This is important for judging whether the data is reliable and whether its findings can be applied to a larger population. Next, we looked at the number of people who used drugs while they gambled, which came to 717 observations. This leads directly into our question of whether drug and alcohol use (there were 3276 observations for alcohol users) is correlated with susceptibility to gambling addiction. We then looked at the number of people with immediate family members who have gambling problems, of which we found 1297 observations. This number was greater than we thought; the proportion of people who had immediate family members with gambling problems was higher than we expected. Finally, we looked at the health problems of those who suffered from gambling addiction. We counted how many people had any of the following conditions: (1) physical illness, (2) mental illness, including depression, anxiety, PTSD, etc., (3) drug or alcohol addiction. In total, 1128 people suffered from at least one of those problems. Again, we found this number surprisingly high, which made us want to look further at the data on alcohol/drug addiction and gambling addiction.
## total completed missing completion_rate
## 1 10199 4707 5492 0.4615
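The completion figures in the table above can be rechecked by hand. The following is a minimal sketch in base R, assuming only the two counts printed in the table; the variable names are illustrative and are not taken from the original analysis code.

```r
# Counts copied from the summary table above.
total     <- 10199  # all baseline observations
completed <- 4707   # observations with a completed follow-up

# Derived values; these should agree with the printed table.
n_missing       <- total - completed
completion_rate <- round(completed / total, 4)

print(data.frame(total, completed, n_missing, completion_rate))
```

Fewer than half of the baseline respondents completed the follow-up, which is worth keeping in mind when comparing baseline and follow-up percentages.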
## # A tibble: 2 × 3
## # Groups: Drug status [2]
## `Drug status` number_base percentage_base
## <int> <int> <dbl>
## 1 0 9599 94.1
## 2 1 600 5.88
## # A tibble: 2 × 4
## # Groups: Drug status [2]
## `Drug status` number_follow percentage_follow whole_percentage
## <int> <int> <dbl> <dbl>
## 1 0 4488 95.4 44
## 2 1 219 4.65 2.15
## # A tibble: 2 × 6
## # Groups: Drug status [2]
## `Drug status` number_base percentage_base number_follow percentage_f…¹ whole…²
## <int> <int> <dbl> <int> <dbl> <dbl>
## 1 0 9599 94.1 4488 95.4 44
## 2 1 600 5.88 219 4.65 2.15
## # … with abbreviated variable names ¹percentage_follow, ²whole_percentage
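The percentage columns above follow from the counts and two denominators: the 10199 baseline observations and the 4707 completed follow-ups. Below is a sketch of that arithmetic in base R, assuming only the counts printed in the table. The raw data and the original code are not reproduced here, and the helper name `pct` is hypothetical.

```r
# Counts copied from the combined table above (Drug status 0 and 1).
drug_status   <- c(0, 1)
number_base   <- c(9599, 600)
number_follow <- c(4488, 219)

# Hypothetical helper: share of a total, expressed as a percentage.
pct <- function(n, total) 100 * n / total

summary_tbl <- data.frame(
  drug_status,
  number_base,
  percentage_base   = pct(number_base, sum(number_base)),
  number_follow,
  percentage_follow = pct(number_follow, sum(number_follow)),
  # Follow-up count as a share of the full baseline sample.
  whole_percentage  = pct(number_follow, sum(number_base))
)

print(round(summary_tbl, 2))
```

Note that `percentage_follow` uses the follow-up total as its denominator, while `whole_percentage` uses the baseline total, which is why the two columns differ by roughly a factor of two.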
## # A tibble: 7 × 3
## # Groups: lottery status [7]
## `lottery status` number_base percentage_base
## <int> <int> <dbl>
## 1 0 441 4.32
## 2 1 1344 13.2
## 3 2 1726 16.9
## 4 3 2341 23.0
## 5 4 2904 28.5
## 6 5 1222 12.0
## 7 6 221 2.17
## # A tibble: 7 × 4
## # Groups: lottery status [7]
## `lottery status` number_follow percentage_follow whole_percentage
## <int> <int> <dbl> <dbl>
## 1 0 257 5.46 2.52
## 2 1 837 17.8 8.21
## 3 2 587 12.5 5.76
## 4 3 958 20.4 9.39
## 5 4 1323 28.1 13.0
## 6 5 604 12.8 5.92
## 7 6 141 3 1.38
## # A tibble: 7 × 6
## # Groups: lottery status [7]
## `lottery status` number_base percentage_base number_follow percentag…¹ whole…²
## <int> <int> <dbl> <int> <dbl> <dbl>
## 1 0 441 4.32 257 5.46 2.52
## 2 1 1344 13.2 837 17.8 8.21
## 3 2 1726 16.9 587 12.5 5.76
## 4 3 2341 23.0 958 20.4 9.39
## 5 4 2904 28.5 1323 28.1 13.0
## 6 5 1222 12.0 604 12.8 5.92
## 7 6 221 2.17 141 3 1.38
## # … with abbreviated variable names ¹percentage_follow, ²whole_percentage
To determine whether gambling tendencies are correlated with drug and alcohol usage, we made a chart of the relationship between gambling tendencies and drug usage. The chart is color-coded: the top portion of each stacked bar represents people who always used drugs while they gambled, and the bottom portion represents people who never used drugs while they gambled. There seems to be a correlation between heavier gambling and drug usage. People with no debt from gambling, that is, those who gambled the least among those surveyed, reported noticeably less drug usage while they gambled, while people with more gambling debt reported more drug usage.
To determine whether personality has an effect on compulsive gamblers, we looked for a relationship between gambling tendencies and impulsiveness. The measure for impulsiveness was taken from the NEO Personality Index. For the gambling survey, researchers included the “Impulsiveness” section of the NEO index to see whether impulsiveness as a personality trait is correlated with tendencies to gamble. The chart above shows the amount of self-reported debt that each gambler has along with their impulsiveness, measured on a scale of 0 to 32 points, with 32 being the most impulsive and 0 being the least impulsive. The chart suggests that any correlation between impulsiveness and tendencies to gamble is likely small. The peak frequency in each gambling category fell at a similar point on the impulsiveness scale, and even those who had more debt did not report significantly higher impulsiveness scores.
This chart shows alcohol usage versus gambling debt. (We are a three-person group, so this is just an extra graph for fun.)